Abstract

We develop and analyze a dilated high performance fault tolerant fast packet multistage 
interconnection network (MIN) in this paper. In this new design, the links at the input and the 
output stages of a dilated banyan-based MIN are rearranged to create multiple routes for each 
source-destination pair in the network after removing one stage in the network. These multiple 
paths are link- and node-disjoint. Fault tolerance at low latency is achieved by sending multiple 
copies of each input packet simultaneously using different routes and different priorities. This 
guarantees that high throughput is maintained even in the presence of faults. Throughput is
evaluated by both simulation and an analytical model, and we show that the new design has
considerably higher performance in the presence of a faulty switching element (SE) or link than
dilated networks. We also analyze the reliability and show that the new design has superior
reliability in comparison to competing proposals.

1 Introduction


High dependability is required in communication for multiprocessor and communication systems. 
For example, high bandwidth transmission systems to carry high volumes of video, voice, and 
data in Broadband Integrated Services Digital Networks (B-ISDN) using Asynchronous Transfer 
Mode (ATM) are becoming common. It remains a challenge to integrate reliability while
maintaining high throughput in switches for B-ISDN. Most switching architectures proposed [1]-[9]
use self-routing, space-switching, and internal nonblocking paradigms. Banyan-based Multistage
Interconnection Networks (MINs) such as Omega and Generalized-cube (GC) [1] have received
considerable attention due to their favorable cost/performance ratio.

High blocking probability and low throughput due to internal contention, however, greatly
limit their capability of handling fast packet switching. Many schemes to reduce blocking
probability and increase throughput have been developed [2]-[9], including Batcher-banyan
networks [2], internal speed up [3], replication in series (Tandem banyan) [4, 5] or in parallel
[6], link dilation in MIN [7, 8], and multi-priority traffic [9]. The maximum throughput
achievable with head-of-line (HOL) collision [10] remains at 0.58. None of these schemes can
tolerate failures. To overcome this problem, redundant paths in a MIN are provided by adding
extra stages or links [11]-[16]. The performance of these networks in the presence of faults is
affected adversely and is too low for fast packet switching for B-ISDN. Altogether, most of
these methods are not suitable for fast packet switching.

Our proposal to achieve fault tolerance without sacrificing performance is to use dilated
networks and rearrange input/output links to provide redundant paths between a
source/destination pair. We show that by modifying the input and output stages of a dilated
network, high performance, low cost, and fault tolerance can be achieved at the same time. In
particular, we discuss the role of dilation in fault tolerance in Section 2 and develop a
space-division fast packet design, called the dilated reduced-stage Multistage Interconnection
Network (DIRSMIN), in Section 3. We evaluate the performance of this network and present our
simulation results in Section 4. In Section 5, we establish the analytical model to compute the
performance of DIRSMIN. In Section 6, we develop the reliability model of DIRSMIN. In Section 7,
we summarize the main results and discuss some possible extensions.

2 Dilation and Fault Tolerance 

An N × N multistage banyan network consists of n = log N (base 2) stages, as shown in Figure
1a. Each stage has N/2 2 × 2 SEs. Different interconnection functions can be used to yield
different topologies such as the generalized cube and Omega networks shown in Figures 1a and
1b, respectively. An Omega network can be redrawn, as shown in Figure 1e, to show that it is
equivalent to a cube network. We therefore consider only Omega networks in our discussion.

To reduce contention and improve the performance of each link, an Omega network can be
dilated by d to form a d-dilated network [7, 8]. It has been shown by us and other researchers
that d = log log N dilation is optimal for meeting the low-latency and high-performance
requirements. All stages use d-dilated 2 × 2 SEs, that is, 2d × 2d SEs. A packet entering a SE
may exit using any of the d links going to the desired SE in the next stage. Cost considerations
limit the degree of dilation. Kumar and Jump [8] have shown that dilated networks always have
higher performance than other comparable schemes. Dilation by itself, however, does not provide
fault tolerance in the presence of faulty SEs. Dependability can be provided in a dilated banyan
network by modifying and rearranging the input and output connections to create multiple paths
using the strategy described below.

To tolerate input/output link failures, each source and destination must be connected to
multiple ports. Each source may feed data through up to p different input ports and each
destination may receive data from up to q different output ports. In a non-dilated banyan
network, we need an N × p to N in-mux stage and an N to N × q out-demux stage to match the
number of input and output ports of the network with the number of sources and destinations, as
shown in Figure 1c. The banyan-based MIN remains unchanged. This structure employing p input and
q output links is called an extra-link MIN or ELMIN(p, q).
By suitably choosing the I/O connections, we can create up to p × q < N/2 possible paths
between each source/destination pair, at least min(p, q) of them entirely independent. Let the
bit patterns $s_{n-1} \cdots s_0$ and $d_{n-1} \cdots d_0$ be the binary strings representing
the addresses of source S and destination D. Typically, in MINs, source
$S = s_{n-1} \cdots s_0$ is connected to network input $I = s_{n-1} \cdots s_0$. Similarly, a
destination $D = d_{n-1} \cdots d_0$ is connected to network output $O = d_{n-1} \cdots d_0$.
In ELMIN(p, q), a source S is connected to p (assuming p is a power of two) input ports of the
network whose addresses are derived by using all possible combinations of the least significant
log p bits of the source address. Similarly, a destination receives from q output ports whose
addresses are derived by using all possible combinations of the most significant log q bits of
its address. An ELMIN(2, 2) derived using an Omega network is shown in Figure 1d without
in-mux/out-demux stages. The in-mux and out-demux can be removed by substituting 2p × 2 and
2 × 2q SEs in the input and the output stages, respectively. The most appropriate values seem to
be p = q = 2.
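
The port selection rule of ELMIN(p, q) can be illustrated with a short Python sketch. It is only an illustration of the rule as stated above, assuming p and q are powers of two and addresses are n-bit integers; the function names `elmin_input_ports` and `elmin_output_ports` are hypothetical and are not part of the original design description.

```python
def elmin_input_ports(source, p):
    """Input ports for `source`: every setting of its least significant log2(p) bits."""
    assert p > 0 and p & (p - 1) == 0, "p is assumed to be a power of two"
    base = source & ~(p - 1)                  # clear the low log2(p) bits
    return [base | low for low in range(p)]


def elmin_output_ports(dest, q, n):
    """Output ports for `dest`: every setting of its most significant log2(q) bits."""
    assert q > 0 and q & (q - 1) == 0, "q is assumed to be a power of two"
    shift = n - (q.bit_length() - 1)          # bit position where the top log2(q) bits start
    low_part = dest & ((1 << shift) - 1)      # keep the remaining low bits unchanged
    return [(high << shift) | low_part for high in range(q)]


print(elmin_input_ports(5, 2))       # [4, 5]
print(elmin_output_ports(5, 2, 3))   # [1, 5]
```

For an 8 × 8 ELMIN(2, 2), source 5 (binary 101) feeds input ports 100 and 101, and destination 5 receives from output ports 001 and 101.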

Multiple path MINs can also be used to improve the performance of the switch by balancing 
the load over these paths. In this scheme, a cell is sent through all available paths with different 
priorities. In case of contention, a cell being routed on the primary path has higher priority. 
Thus, the basic capabilities of a MIN are not affected. However, additional cells may be routed 
through the secondary paths, improving the overall performance. We use this scheme in dilated 
banyan networks after rearranging the links to achieve higher throughput in our design, presented 
in the next section. 


3 Dilated Reduced-Stage MIN (DIRSMIN) 

To tolerate failure of an internal SE node, a combination of dilation and the ELMIN type
connection scheme offers a very attractive option [17]. Because of dilation, extra links coming
from source nodes and going to destination nodes can be directly connected to SEs without
incorporating any in-mux or out-demux stages, as shown in Figure 1d. Also, because multiple ports
connect to each destination, the first stage of the banyan network no longer needs to switch on
the most significant bit of the routing tag. Thus, the first stage can be eliminated. This
(log N - 1)-stage dilated MIN is called a Dilated Reduced-Stage MIN (DIRSMIN). The stages from
the input to the output are numbered as stage (n - 2) to stage 0. The last stage uses dilated
2 × 4 SEs to provide adequate routing. The rest of the stages use dilated 2 × 2 SEs. DIRSMIN
reduces the delay in the network as there is one less switching stage. Moreover, our simulations
show that the throughput also improves due to the reduction in stages.

3.1 Input/Output Connections. 

We develop an example design with four inputs and two dilated output links to provide adequate
performance and fault tolerance. The links in a non-dilated banyan are numbered from 0 to N - 1
from top to bottom at the input and output of each stage independently. In ELMIN, each source
is connected to the two network ports specified by $s_{n-1} s_{n-2} \cdots s_1 s_0$ and
$s_{n-1} s_{n-2} \cdots s_1 \bar{s}_0$. After the shuffle, these ports are connected to links
$s_{n-2} \cdots s_1 s_0 s_{n-1}$ and $s_{n-2} \cdots s_1 \bar{s}_0 s_{n-1}$, respectively. Each
of the corresponding SEs can switch this link to two output links, which are specified by
$s_{n-2} \cdots s_1 s_0 s_{n-1}$ and $s_{n-2} \cdots s_1 s_0 \bar{s}_{n-1}$ for the original
connection, and by $s_{n-2} \cdots s_1 \bar{s}_0 s_{n-1}$ and
$s_{n-2} \cdots s_1 \bar{s}_0 \bar{s}_{n-1}$ for the second connection in ELMIN. The links go
through a shuffle at the output of the first stage. In DIRSMIN, we connect the four output links
from each source directly to the corresponding positions in the new first stage (the second stage
earlier) with different priorities, as shown in Figure 1f.
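
The four links reached by a source can be enumerated with the following Python sketch. It assumes the reading given above (two ELMIN ports that differ in $s_0$, a perfect shuffle, and an SE that may set the last bit of the link label either way); the helper names `shuffle` and `first_stage_outputs` are hypothetical.

```python
def shuffle(link, n):
    """Perfect shuffle: rotate the n-bit link label left by one position."""
    msb = (link >> (n - 1)) & 1
    return ((link << 1) & ((1 << n) - 1)) | msb


def first_stage_outputs(source, n):
    """Labels of the four links that a source reaches at the output of the removed stage."""
    ports = [source, source ^ 1]            # the two ELMIN ports: s_0 and its complement
    links = set()
    for port in ports:
        link = shuffle(port, n)             # label now ends in ... s_0 s_{n-1}
        links.update({link, link ^ 1})      # the SE can set the last bit either way
    return sorted(links)


for src in (0, 1, 8, 9):
    print(src, first_stage_outputs(src, 4))   # each prints [0, 1, 2, 3]
```

In a 16 × 16 network, sources 0, 1, 8, and 9 share the bits $s_2 s_1 = 00$ and therefore reach the same four link positions, which is why they must be assigned different priorities.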

The network can be partitioned into four subnetworks from stage (n - 2) to stage 1, as shown
in Figure 1g. The SEs in stage 0 combine outputs from two subnetworks and then route to the
destination. The subnetworks are numbered from 0 to 3. Notice the order of numbering. Let the
subnetwork number be denoted by $n_1 n_0$. Each source is connected to exactly one input in each
subnetwork. If the links in each subnetwork are renumbered from 0 to N/4 - 1, then a source
$S = s_{n-1} s_{n-2} \cdots s_1 s_0$ is connected to link $s_{n-3} \cdots s_1 s_{n-2}$ in each
subnetwork. A destination $d_{n-1} \cdots d_0$ is connected to a pair of conjugate SEs,
$d_{n-1} d_{n-2} \cdots d_1$ and $\bar{d}_{n-1} d_{n-2} \cdots d_1$.

In each subnetwork, there are up to N packets per time slot (one copy from each source), divided
equally among the four priorities. In one time slot, each destination can receive up to 2d
packets from the two conjugate SEs. The number of output links of a SE at the last stage of the
modified DIRSMIN does not have to be the same as the dilation degree and can be a design
parameter d', chosen based on the desired performance. We use the notation (d, d')-DIRSMIN to
indicate a DIRSMIN with d dilation from stages (n - 2) to 0, and d' dilation for the four output
ports of each 2 × 4 SE in the last stage. Thus, 2d' is the total number of links connected to
each destination, and d' can be varied in the range 1 ≤ d' ≤ d. Notice that a (d, d/2)-DIRSMIN
has the same number of links connected to each destination as a d-dilated Omega network but has
one less stage.

3.2 Operation and Fault Tolerance. 

The network is operated in a time-synchronized fashion as required in the ATM standard. In
each time slot, every source sends four copies of an input packet with four different priorities
to the four links, with the destination as the routing tag. The priority of a packet from a
source in subnetwork $n_1 n_0$ is given by $s_0 s_{n-1} \oplus n_1 n_0$. For example, for the
16 × 16 case, nodes 0, 1, 8, and 9 are all connected to dilated link 0 in each subnetwork.
However, the priorities of source 0 are 0, 1, 2, and 3, and the priorities of source 9 are 3, 2,
1, and 0 in subnetworks 0, 1, 2, and 3, respectively. We refer to the priority 0 link as the
primary link, the priority 1 link as the secondary link, and so on. Priority 0 is the highest
priority. The routing of packets is governed by the destination address bits $d_{n-2}$ to $d_1$
in stages (n - 2) to 1, respectively. SEs in the last stage route packets to 4 different output
ports using bits $d_0 d_{n-1}$. Contention can occur within each SE when the number of packets
destined to one particular output port of a SE exceeds the dilation degree of that port. In such
a situation, packets are dropped following the priority order. In case of contention among
packets of the same priority, packets are dropped randomly. If more than one copy of the same
packet arrives at the last stage, only the copy with the highest priority is transmitted to the
destination.
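
The priority rule and the subnetwork link assignment can be checked against the 16 × 16 example with a small Python sketch. It assumes the priority is the two-bit value $s_0 s_{n-1}$ XORed with the subnetwork number, and that the link inside a subnetwork is $s_{n-3} \cdots s_1 s_{n-2}$ (so n ≥ 3); the function names are hypothetical.

```python
def bit(value, position):
    """Return bit `position` of `value` (0 is the least significant bit)."""
    return (value >> position) & 1


def copy_priority(source, subnetwork, n):
    """Priority of the copy that `source` sends into `subnetwork` (0 is highest)."""
    tag = (bit(source, 0) << 1) | bit(source, n - 1)   # two-bit value s_0 s_{n-1}
    return tag ^ subnetwork


def subnetwork_link(source, n):
    """Dilated link, numbered 0 .. N/4 - 1, used by `source` inside every subnetwork."""
    middle = (source >> 1) & ((1 << (n - 3)) - 1)      # bits s_{n-3} ... s_1
    return (middle << 1) | bit(source, n - 2)


print([copy_priority(0, k, 4) for k in range(4)])   # [0, 1, 2, 3]
print([copy_priority(9, k, 4) for k in range(4)])   # [3, 2, 1, 0]
```

Under these assumptions the four sources that share a dilated link always receive four distinct priorities in any given subnetwork.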

The redundancy graph of the four priority copies from a source to a destination is shown in
Figure 1h. At the last stage, priority 0 and priority 2 copies of a packet from the same input
reach the same SE. Similarly, priority 1 and priority 3 copies reach another SE. Low priority
copies will be lost if there is contention in the last stage. Thus, the probability that a
priority 2 or priority 3 copy passes through the network and becomes the only successful copy is
small. This is also seen in our simulation and analytical results. Accordingly, a two priority
scheme may be used, where each source sends out only two copies in each time slot, if that level
of fault tolerance is acceptable. In this case, each incoming packet has two routes instead of
four and the scheme may be less robust than the four priority scheme under multiple faults. By
sending redundant copies to four SE-disjoint subnetworks in one time slot, DIRSMIN can tolerate
at least 3 SE faults in stages (n - 2) to 1 and at least 1 SE fault at the last stage. It
remains robust in the presence of more SE faults in the network. In particular, the network
works well even when one whole subnetwork becomes faulty.

3.3 Implementation Issues 

A DIRSMIN is designed based on a d-dilated banyan network. Each SE from stage (n - 2) to
1 in a (d, d')-DIRSMIN has the same size, 2d × 2d, as a SE in a d-dilated banyan network. In a
(d, d)-DIRSMIN, the size of the SEs at the last stage is 2d × 4d, which is twice as big as the
size of the SEs used in a d-dilated banyan network. But a (d, d)-DIRSMIN has one less stage than
a d-dilated banyan network. One way to compare the costs of different networks is to estimate
SE complexity, defined by the total number of crosspoints in the switch. The number of
crosspoints in a (d, d)-DIRSMIN is
(log N - 2) × N/2 × 2d × 2d + N/2 × 2d × 4d = 2Nd^2 log N. This is the same as in a d-dilated
Omega network. Thus, the two networks are equivalent in terms of switch complexity. A
(d, d')-DIRSMIN with d' < d has fewer crosspoints than the corresponding Omega network with
dilation d.
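
The crosspoint comparison can be reproduced numerically with the following Python sketch. It assumes that the last-stage SEs of a (d, d')-DIRSMIN have size 2d × 4d', which is our reading of the text for d' < d; the function names are hypothetical.

```python
def num_stages(N):
    """log2(N) for a network size that is a power of two."""
    n = N.bit_length() - 1
    assert N == 1 << n, "N is assumed to be a power of two"
    return n


def crosspoints_dilated_omega(N, d):
    """log N stages of N/2 SEs, each of size 2d x 2d."""
    return num_stages(N) * (N // 2) * (2 * d) * (2 * d)


def crosspoints_dirsmin(N, d, d_out):
    """(log N - 2) stages of 2d x 2d SEs plus one last stage of 2d x 4d' SEs."""
    inner = (num_stages(N) - 2) * (N // 2) * (2 * d) * (2 * d)
    last = (N // 2) * (2 * d) * (4 * d_out)
    return inner + last


print(crosspoints_dilated_omega(256, 4))   # 65536
print(crosspoints_dirsmin(256, 4, 4))      # 65536
print(crosspoints_dirsmin(256, 4, 2))      # 57344
```

For N = 256 and d = 4, both the 4-dilated Omega network and the (4, 4)-DIRSMIN need 2Nd^2 log N = 65536 crosspoints, while the (4, 2)-DIRSMIN needs fewer.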

4 Performance Using Simulation Techniques

We compare the performance of DIRSMIN and the dilated Omega network using simulation under
commonly used assumptions. As stated earlier, the dilated Omega network has been shown to yield
the best performance/cost among the competing architectures. It has also been shown in [15] that
ELMIN type networks perform much better under non-uniform traffic conditions, so we do not
consider non-uniform traffic here. In the simulation, we use a Bernoulli process with parameter
λ to describe the arrival of packets at a source node and assume that the input arrival
processes at the source nodes are independent. The output port requested by any input packet is
chosen uniformly among all output ports.

The routing within the switch is as described earlier. The output collected at the destinations
is C, and the loss probability is calculated as 1 - C/I(λ), where I(λ) is the total number of
input packets. The simulation results are analyzed for a confidence interval of 95%, and the
variation in the results is within 3% of the mean value. For clarity, we do not show error bars
on the graphs except in Figures 8 and 9, where we compare simulation and analytical results.
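
The traffic model and the loss estimate 1 - C/I(λ) can be sketched in Python for the simpler case of a fault-free d-dilated Omega network. This is not the simulator used to produce the results reported here; it is a minimal illustration assuming one packet per source per slot with probability λ, uniform destinations, destination-tag routing with the most significant bit first, and random dropping when more than d packets compete for a dilated link. The function name and its parameters are hypothetical.

```python
import random
from collections import defaultdict


def simulate_dilated_omega(n, d, lam, slots, seed=0):
    """Estimate the loss probability 1 - C/I of a d-dilated 2^n x 2^n Omega network."""
    rng = random.Random(seed)
    size = 1 << n
    mask = size - 1
    offered = delivered = 0
    for _ in range(slots):
        # Bernoulli(lam) arrival at every source, with a uniformly chosen destination.
        links = defaultdict(list)
        for src in range(size):
            if rng.random() < lam:
                links[src].append(rng.randrange(size))
                offered += 1
        for stage in range(n):
            nxt = defaultdict(list)
            for link, dests in links.items():
                shuffled = ((link << 1) & mask) | (link >> (n - 1))   # perfect shuffle
                for dest in dests:
                    routing_bit = (dest >> (n - 1 - stage)) & 1       # MSB is used first
                    nxt[(shuffled & ~1) | routing_bit].append(dest)
            # A dilated link carries at most d packets; the excess is dropped at random.
            links = {link: (dests if len(dests) <= d else rng.sample(dests, d))
                     for link, dests in nxt.items()}
        # After n stages every surviving packet sits on the link of its destination.
        assert all(dest == link for link, dests in links.items() for dest in dests)
        delivered += sum(len(dests) for dests in links.values())
    return 1.0 - delivered / offered if offered else 0.0
```

Calling `simulate_dilated_omega(8, 4, 1.0, 1000)` would estimate the full-load loss of a 4-dilated 256 × 256 Omega network under these assumptions; modelling DIRSMIN would additionally require the four prioritized copies and the 2 × 4 last stage.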

4.1 Performance Results Under Non-faulty Condition 

For the non-faulty case, λ = 1 is used to simulate the performance of the network under heavy
load, where each source always has a packet to send. We also vary both d and d'. Figure 2 shows
the throughput obtained for different priority packets. Figure 3 depicts the packet loss
probability of DIRSMIN and the dilated Omega networks. The loss probability decreases as the
dilation degree increases for a 256 × 256 network. The simulations also show that the
performance of a d-dilated Omega network lies between that of a (d, d/2)-DIRSMIN and a
(d, d)-DIRSMIN of the same size. So at equal complexity, DIRSMIN performs better than the Omega
network of the corresponding size. We also find that the number of packets with priority 0
passed by a (d, d)-DIRSMIN is greater than that passed by a d-dilated Omega network. Therefore,
even without using multiple priorities, a (d, d)-DIRSMIN performs better than a d-dilated Omega.
It is also observed that as the dilation degree increases, the performance difference between
the two priority scheme and the four priority scheme decreases. In almost all cases, the two
priority scheme performs well enough in terms of throughput.

4.2 Performance Results Under Faulty Conditions 

To study the effects of faults, we assume the following fault model: (1) any SE, including those 
at the last stage, can fail; and (2) faulty SEs are unusable. In our simulation, we assume that 
no fault diagnosis is performed and the packets sent to a faulty SE are lost. The corresponding 
faulty Omega network is also simulated for comparison. We simulate two types of fault
situations. The first one assumes that one arbitrary SE at the last stage is faulty. The second 
one assumes that one whole subnetwork is faulty. 

For the two fault situations, the packet loss probabilities of DIRSMIN and dilated Omega as 
a function of the dilation degree under full loads are shown in Figures 4 and 5. The simulations 
show that both a (d, d/2)-DIRSMIN and a (d, d)-DIRSMIN perform better than a d-dilated 
Omega in the presence of faulty SEs. In general, we find that a (d, d')-DIRSMIN, d/2 ≤ d' ≤
d, performs much better than a dilated Omega in the presence of faults. A (d, d)-DIRSMIN
performs the best. 

The throughput of a dilated Omega network in the presence of faults is limited by how many
SEs become faulty. Each SE fault disconnects a set of sources from a set of destinations. If s is
the maximum number of faulty SEs at one stage, then under the assumption of uniform independent
traffic patterns, the minimum loss probability of a dilated Omega network is s/(N/2)
irrespective of the dilation degree. Thus, for a dilated 64 × 64 Omega network, the loss
probability is at least 1/32 = 0.031 in the first case, where one SE is faulty at the last
stage, and 8/32 = 0.25 in the second case, where one fourth of the SEs at stages (n - 2) to 1
are faulty. In a DIRSMIN,
however, no single fault can disconnect any S/D pairs. As long as the full access property (i.e., 
each source is able to communicate with every destination) is maintained, different performance 
requirements can still be met by varying the number of dilations d and d' for a given switch size. 
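
The lower bound s/(N/2) on the loss probability of a faulty dilated Omega network is simple arithmetic; the Python sketch below reproduces the two 64 × 64 cases quoted above. The function name `omega_loss_floor` is hypothetical.

```python
from fractions import Fraction


def omega_loss_floor(N, faulty_per_stage):
    """Minimum loss probability s / (N/2) when s SEs of one stage are faulty."""
    return Fraction(faulty_per_stage, N // 2)


print(omega_loss_floor(64, 1))   # 1/32
print(omega_loss_floor(64, 8))   # 1/4
```

The bound does not depend on d, which is why increasing the dilation alone cannot recover the lost traffic in an Omega network.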

In the second set of simulations, we use several different input traffic loads. The packet
loss probabilities as a function of input traffic load for the two faulty situations are shown in
Figures 6 and 7, respectively. As the input load decreases, the performance improves,
as expected. From the simulations, we also notice that the contributions from the lower two
priorities decrease further as the input traffic load decreases. We conclude that the lower two
priorities make a significant contribution to the throughput only when dilation degrees are small
(for example, d = 4 for a 256 × 256 switch) and the traffic load is high. Thus, for all practical
purposes, the two priority scheme works as well as the four priority scheme.

5 Performance Using Analytical Methods 

The throughput of dilated banyans under the independent uniform traffic assumption was
calculated analytically by Kruskal and Snir [7]. In their calculations, only one class of packets
was considered. The analytical model for multiple priority classes was studied by S. Tridandapani
[9], where different sources generate different classes of traffic. However, in DIRSMIN, each
source sends the same packet to four subnetworks with multiple priorities. Within each subnetwork,
the packets are from different sources. At the last stage, priority 0 and 2 copies or priority 1 
and priority 3 copies from the same source may reach the same SE. This causes traffic patterns 
to be correlated and the independent uniform assumption is no longer valid for priority 2 and 
priority 3 copies in the last stage. Furthermore, only the highest priority copy of the packet 
from each source is collected at each destination. The contribution of lower priority copies to 
the throughput is made only when there are no higher priority copies reaching the destination. 
Thus, the existing results can only be used within each subnetwork. A new analysis is needed 
to take into account the correlated traffic pattern at the last stage and destinations. 

In the following, we make the same assumptions as used in our simulation. The input 
traffic pattern from each source is assumed to be independent and uniform. The SEs operate
in synchronized time slots and all packets have the same length. Each slot corresponds to the
transmission time of one packet across the network. 

5.1 Dilated Banyan Network

We first review the throughput and loss probability calculations for dilated banyan networks.
In a banyan network, there is only one type of traffic. Let us define $R_m(j)$ to be the
probability that j packets are forwarded to a tagged output of a SE at stage m, where
0 ≤ j ≤ d for a dilation degree of d. For convenience, we renumber the stages 1 to n from the
input stage to the output stage in the following. Notice that this numbering is different from
that used in the previous section.

The probability that j packets (0 ≤ j ≤ 2d) reach the input of a SE at stage m + 1 is

\[ S_{m+1}(j) = \sum_{i=0}^{j} R_m(i) R_m(j-i), \]

where $R_m(i)$ is taken to be zero for i > d. The probability that l of these j packets are
destined for a tagged output is

\[ \binom{j}{l} \left(\frac{1}{2}\right)^{j}. \]

Thus,

\[ R_{m+1}(j) = \begin{cases}
  \sum_{k=j}^{2d} S_{m+1}(k) \binom{k}{j} 2^{-k}, & j < d \\
  \sum_{k=d}^{2d} S_{m+1}(k) \sum_{l=d}^{k} \binom{k}{l} 2^{-k}, & j = d
\end{cases} \qquad (1) \]

The boundary conditions at the input are given by the following.

\[ R_0(j) = \begin{cases}
  \lambda, & j = 1 \\
  1 - \lambda, & j = 0 \\
  0, & j \neq 0, 1
\end{cases} \]

The packet loss probability is then given by

\[ P_{loss} = 1 - \frac{1}{\lambda} \sum_{j=0}^{d} j R_n(j). \qquad (2) \]
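
The recursion reviewed in this subsection is straightforward to evaluate numerically. The Python sketch below is one possible implementation under the stated independent uniform traffic assumption, with d ≥ 1 and λ > 0; the function name `dilated_banyan_loss` is hypothetical.

```python
from math import comb


def dilated_banyan_loss(n, d, lam):
    """Packet loss probability of an n-stage d-dilated banyan under uniform traffic."""
    # Boundary condition R_0: one packet with probability lam, none otherwise.
    R = [0.0] * (d + 1)
    R[0], R[1] = 1.0 - lam, lam
    for _ in range(n):
        # S(j): j packets arrive over the two dilated inputs of an SE.
        S = [sum(R[i] * R[j - i] for i in range(max(0, j - d), min(j, d) + 1))
             for j in range(2 * d + 1)]
        nxt = [0.0] * (d + 1)
        for k, s_k in enumerate(S):
            for l in range(k + 1):
                # l of the k packets choose the tagged output; at most d are forwarded.
                nxt[min(l, d)] += s_k * comb(k, l) * 0.5 ** k
        R = nxt
    throughput = sum(j * R[j] for j in range(d + 1))
    return 1.0 - throughput / lam


print(dilated_banyan_loss(1, 1, 1.0))   # 0.25
```

For a single undilated 2 × 2 SE at full load the sketch gives a loss of 0.25, which matches the familiar result that two packets collide with probability 1/2 and one of them is then dropped.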

5.2 DIRSMIN

For analysis, we also renumber the stages in a (d, d')-DIRSMIN from 1 to n - 1 from the input
stage to the output stage in this section. Recall that the dilation degree of the internal links
is d and that of each of the four output ports at the last stage is d'. The derivation for
priority 0 copies can be done in exactly the same way as in the case of dilated networks. We use
the notations $R_m(p)(j)$ and $S_m(p)(j)$ instead of $R_m(j)$ and $S_m(j)$, respectively, to
specify priority p in the following discussion.

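From stage 1 to stage n - 2, the same equations derived in the previous section can be used
to calculate $R_m(0)(j)$ and $S_m(0)(j)$ for priority 0. However, the last stage (stage n - 1) in
a (d, d')-DIRSMIN performs 2 × 4 switching. The probability that l out of j priority 0 packets
are destined to a particular output at the last stage is

\[ \binom{j}{l} \left(\frac{1}{4}\right)^{l} \left(1 - \frac{1}{4}\right)^{j-l}. \]

Thus,

\[ R_{n-1}(0)(j) = \begin{cases}
  \sum_{k=j}^{2d} S_{n-1}(0)(k) \binom{k}{j} \left(\frac{1}{4}\right)^{j} \left(1 - \frac{1}{4}\right)^{k-j}, & j < d' \\
  \sum_{k=d'}^{2d} S_{n-1}(0)(k) \sum_{l=d'}^{k} \binom{k}{l} \left(\frac{1}{4}\right)^{l} \left(1 - \frac{1}{4}\right)^{k-l}, & j = d'
\end{cases} \qquad (3) \]

Therefore,

\[ S_n(0)(j) = \sum_{i=0}^{j} R_{n-1}(0)(i) R_{n-1}(0)(j-i). \]
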
To calculate the throughput of the lower priority copies, we define $R_m(i)(b, c)$ for
1 ≤ i ≤ 3 to be the probability that c packets of priority i and b packets of priority i - 1 or
higher are forwarded to an output address of a SE at stage m. As defined earlier, priority 0 is
the highest priority and priority 3 is the lowest priority.

Since traffic to each subnetwork is identical and packets within each subnetwork are from
independent sources, the independent uniform traffic assumption holds for all of the priority
copies from stage 1 to stage n - 2. The probability that g copies of priority i and f copies of
priority i - 1 or higher reach a SE at stage m + 1 is

\[ S_{m+1}(i)(f, g) = \sum_{s=0}^{f} \sum_{t=0}^{g} R_m(i)(s, t) R_m(i)(f - s, g - t). \qquad (4) \]

The probability that k of these g packets of priority i and j of these f packets of higher
priorities are destined to a particular output is

\[ \binom{f}{j} \left(\frac{1}{2}\right)^{f} \binom{g}{k} \left(\frac{1}{2}\right)^{g}. \]

Thus, for j < d and j + k < d,

\[ R_{m+1}(i)(j, k) = \sum_{g=k}^{2d} \sum_{f=j}^{2d} S_{m+1}(i)(f, g) \binom{f}{j} \left(\frac{1}{2}\right)^{f} \binom{g}{k} \left(\frac{1}{2}\right)^{g}. \qquad (5) \]


For k < d and j + k = d, 

At the last stage, each SE is performing 2 × 4 routing and each output port of a SE has
d' links. The probability that k of g copies of priority i and j of f copies of priority higher
than i are destined to a specific output is

\[ \binom{f}{j} \left(\frac{1}{4}\right)^{j} \left(1 - \frac{1}{4}\right)^{f-j} \binom{g}{k} \left(\frac{1}{4}\right)^{k} \left(1 - \frac{1}{4}\right)^{g-k}. \]

Then, Equations 5 and 6, respectively, change to Equations 7 and 8 as follows. 

For j < d' and j + k < d' , 

The throughput for priority 0 packets is given by

\[ T(0) = 2 \sum_{j=0}^{d'} j R_{n-1}(0)(j). \]

The factor of 2 accounts for the fact that there are two groups of d' links from two different
SEs at the last stage to each destination. The probability that the priority 0 copy of a packet
from an input node passes through the network is simply $P_{n-1}(0) = T(0)/\lambda$.
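
The priority 0 throughput can be evaluated with a short Python sketch that chains n - 2 stages of 2 × 2 routing with capacity d and one last stage of 2 × 4 routing with capacity d'. It is an illustration only, assuming n ≥ 2, d ≥ 1, and the same input boundary condition as for dilated banyans; the helper names `convolve`, `forward`, and `dirsmin_priority0_throughput` are hypothetical.

```python
from math import comb


def convolve(R, d_in):
    """S(j): probability that j packets arrive over the two dilated inputs of an SE."""
    return [sum(R[i] * R[j - i] for i in range(max(0, j - d_in), min(j, d_in) + 1))
            for j in range(2 * d_in + 1)]


def forward(S, capacity, p):
    """R(j): j packets forwarded to a tagged output that each packet picks with prob. p."""
    R = [0.0] * (capacity + 1)
    for k, s_k in enumerate(S):
        for l in range(k + 1):
            prob = comb(k, l) * p ** l * (1.0 - p) ** (k - l)
            R[min(l, capacity)] += s_k * prob      # packets beyond the capacity are dropped
    return R


def dirsmin_priority0_throughput(n, d, d_out, lam):
    """T(0) of a (d, d')-DIRSMIN with 2^n ports, following Equation 3."""
    R = [0.0] * (d + 1)
    R[0], R[1] = 1.0 - lam, lam                    # input boundary condition
    for _ in range(n - 2):                         # stages 1 .. n-2: 2 x 2 routing
        R = forward(convolve(R, d), d, 0.5)
    R_last = forward(convolve(R, d), d_out, 0.25)  # stage n-1: 2 x 4 routing, d' links
    return 2.0 * sum(j * R_last[j] for j in range(d_out + 1))
```

As a sanity check, for N = 4 the network has a single stage; with d = d' = 2 and λ = 1 no priority 0 packet can be dropped, and the sketch returns T(0) = 1.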

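The boundary conditions for priority i = 1, 2, and 3 copies at the input are

\[ R_0(i)(j, k) = \begin{cases}
  \binom{i}{j} (1 - \lambda)^{i-j} \lambda^{j+1}, & 0 \le j \le i,\ k = 1 \\
  \binom{i}{j} (1 - \lambda)^{i-j+1} \lambda^{j}, & 0 \le j \le i,\ k = 0 \\
  0, & \text{otherwise}
\end{cases} \]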
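
The boundary condition can be tabulated and checked for normalization with a few lines of Python. The sketch assumes the form in which the i higher priority sources sharing the link each send independently with probability λ; the function name `input_boundary` is hypothetical.

```python
from math import comb


def input_boundary(i, lam):
    """Table {(j, k): R_0(i)(j, k)} for priority i in 1..3 at one dilated input link."""
    table = {}
    for j in range(i + 1):
        # j of the i higher priority sources on this link have a packet.
        higher = comb(i, j) * (1.0 - lam) ** (i - j) * lam ** j
        table[(j, 1)] = higher * lam            # the priority i source sends a copy
        table[(j, 0)] = higher * (1.0 - lam)    # the priority i source is idle
    return table


for i in (1, 2, 3):
    print(i, round(sum(input_boundary(i, 0.7).values()), 12))   # each sum is 1.0
```

Summing the table over all (j, k) gives 1 for every priority, as expected of a probability distribution.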


To calculate the throughput of priority 1 copies, we observe that a priority 1 copy from one
particular source is accepted by its destination only when it successfully reaches its
destination and the corresponding priority 0 copy does not make it to the destination because of
contention. We first calculate the probability that k (0 ≤ k ≤ d') priority 1 copies reach one
particular output of a SE at the last stage as

\[ R'_{n-1}(1)(k) = \sum_{j=0}^{d'-k} R_{n-1}(1)(j, k). \]

Then,

\[ P_{n-1}(1) = 2 \sum_{k=0}^{d'} k R'_{n-1}(1)(k) / \lambda \]

gives the probability that a priority 1 copy passes through the network. The probability that a
priority 1 copy of an input packet is the only copy at the destination is
$P_{n-1}(1)(1 - P_{n-1}(0))$. Notice that $(1 - P_{n-1}(0))$ is the probability that the
corresponding priority 0 copy does not reach the destination due to contention. Even though the
four subnetworks are isomorphic, the copies from each source are assigned different priorities
inside different subnetworks, forming different traffic patterns inside different subnetworks.
If both the priority 1 and priority 0 copies reach the destination with no loss, then the
probability that the priority 1 copy is the only copy at the destination is 0, which is correctly
described by the expression above. The net throughput of priority 1 copies can be calculated as

\[ T(1) = 2 \sum_{k=1}^{d'} \sum_{y=1}^{k} \binom{k}{y} y \left[ P_{n-1}(1) (1 - P_{n-1}(0)) \right]^{y} (1 - P_{n-1}(1))^{k-y} R'_{n-1}(1)(k). \qquad (9) \]

The calculations of the throughput of priority 2 and priority 3 copies are further complicated 
by the fact that priority 2 (0) and priority 3 (1) copies of the same input packet, if both are 
successful, reach the same SE at the last stage. Thus, the independent uniform traffic assumption 
cannot be used directly for priority 2 and 3 traffic at the input of the last stage. We first calculate 
the number of distinct priority 2 (3) copies that do not have their corresponding higher priority 
copies reaching the same SE at last stage. This can be done in the same way as above in the 
calculation of distinct priority 1 copies. We use the independent uniform traffic assumption on 
these distinct priority 2 (3) copies to calculate the net throughput of priority 2 (3). 

The probability that a priority i (i = 2, 3) copy passes through the network up to the input
of the last stage is

\[ P_{n-2}(i) = \sum_{k=0}^{d} \sum_{j=0}^{d-k} k R_{n-2}(i)(j, k) / \lambda. \]

The probability that k distinct copies of priority 2 and j copies of higher priorities reach an
input port of a SE at the last stage is

\[ R'_{n-2}(2)(j, k) = \sum_{y=k}^{d-j} R_{n-2}(2)(j, y) \binom{y}{k} \left[ P_{n-2}(2) (1 - P_{n-2}(0)) \right]^{k} (1 - P_{n-2}(2))^{y-k}. \qquad (10) \]

The probability that k distinct copies of priority 3 and j copies of higher priorities reach an
input port of a SE at the last stage is given simply by substituting quantities for priority 2
with those for priority 3, and quantities for priority 0 with those for priority 1, in the above
equation.

The independent uniform traffic assumption can now be applied to these distinct priority
2 and 3 copies at the input of the last stage. Equation 4 can be used first to calculate, from
$R'_{n-2}(i)(j, k)$, the probability $S_{n-1}(i)(j, k)$ (i = 2, 3) that k distinct priority i
copies and j higher priority copies reach the input of a SE at the last stage. From that,
$R_{n-1}(i)(j, k)$ (i = 2, 3), which is the probability that k distinct priority i packets and j
packets of higher priorities are forwarded to an output port of a SE at the last stage, can be
calculated using Equation 7 and Equation 8. The total number of priority 2 copies reaching the
destination, which are distinct from their corresponding priority 0 copies, is

\[ \hat{T}(2) = 2 \sum_{k=0}^{d'} \sum_{j=0}^{d'-k} k R_{n-1}(2)(j, k). \]

Similarly, the total number of priority 3 copies reaching a destination, which are distinct from
their corresponding priority 1 copies, is

\[ \hat{T}(3) = 2 \sum_{k=0}^{d'} \sum_{j=0}^{d'-k} k R_{n-1}(3)(j, k). \]

At the destination, some of these priority 2 (3) copies are again dropped because the
corresponding priority 1 (0) copy of the same packet from the conjugate SE reaches the
destination. The probability that this happens to priority 2 copies can be estimated by

\[ T(1) / (\lambda - T(0)), \]

where λ - T(0) is the remaining traffic that is not passed by priority 0 copies and T(1) is the
traffic passed by priority 1 copies among this remaining traffic. Finally, the net throughput of
priority 2 copies is given by

\[ T(2) = \hat{T}(2) \left( 1 - \frac{T(1)}{\lambda - T(0)} \right). \]

We find that the net throughput of priority 2 copies is always a few orders of magnitude
smaller than that of priority 0 copies for DIRSMIN, and therefore it can be neglected in
estimating the net throughput of priority 3 copies. Notice that out of $\lambda P_{n-1}(1)$
priority 1 copies that get to a destination, T(1) are distinct from priority 0 copies. From this,
the distinct priority 3 copies which are accepted by a destination can be estimated by

= 7(3) 

Finally, the throughput of the network is

\[ T = T(0) + T(1) + T(2) + T(3). \]

The packet loss probability of DIRSMIN is

\[ P_{loss} = 1 - \frac{T}{\lambda}. \]


5.3 Numerical Results 

The results obtained above are checked against the simulation results and are found to be
consistent. One comparison of results for a 64 × 64 network is shown in Figures 8 and 9. The
performance of a (d, d')-DIRSMIN and a dilated Omega network at full load (λ = 1) and
variable loads is shown in Figures 10 and 11 for a 256 × 256 network. We observe that the
performance of a d-dilated Omega lies in between that of a (d, d/2)-DIRSMIN and a
(d, d)-DIRSMIN. The performance of DIRSMIN improves as both d and d' increase, and the role of
the two low priorities diminishes.

6 Reliability Analysis of DIRSMIN 

The all-terminal reliability, R(t), is one of the most important measures of the effectiveness of
a fault-tolerant scheme employing redundancy. It is the probability that there exists a path
between each source and every destination. SE failures are assumed to be random and
independent events. Exact analysis of reliability in general is known to be NP-hard [18].
Normally only analytical bounds on reliability can be obtained. Monte Carlo simulations have
been used to get more accurate numerical results.

Due to the unique path property of Omega type banyan networks, the all-terminal reliability
for such networks is $r^{(N/2)\log N}$, where $(N/2)\log N$ is the total number of SEs in the
banyan network. The reliability diminishes very quickly with the increasing size of the network.
The multiple paths of DIRSMIN tremendously help improve the reliability. We assume that all
the SEs in the switch have the same reliability r(t).
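
To make the size dependence concrete, the sketch below evaluates the unique-path expression for a few network sizes. It assumes 2 x 2 SEs (so the logarithm is base 2) and N a power of two; the function name `omega_reliability` and the sample value r = 0.999 are chosen here for illustration.

```python
def omega_reliability(n_ports, r):
    """All-terminal reliability of a unique-path N x N banyan network.

    Every one of the (N/2) * log2(N) SEs must work, so the reliability is
    r raised to that count. Assumes n_ports is a power of two.
    """
    stages = n_ports.bit_length() - 1      # log2(N) for a power of two
    num_ses = (n_ports // 2) * stages      # N/2 SEs in each stage
    return r ** num_ses


for n in (16, 64, 256, 1024):
    print(n, omega_reliability(n, 0.999))
```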
6.1 Network Reliability of DIRSMIN

We note from Figure 1g that DIRSMIN can be redrawn as four SE-disjoint sub-banyans linked
at the last stage. The whole network consists of two disjoint subsystems, each consisting of
two sub-banyans linked by N/4 SEs at the last stage. In Figure 1h, one of the two subsystems
consists of subnetworks 0 and 2 and the N/4 SEs at the last stage to which they are connected,
and the other consists of subnetworks 1 and 3 and their corresponding SEs at the last stage.
The reliability of two identical subsystems in parallel is

$$R = 1 - (1 - R_{sub})^2, \qquad (11)$$

where $R_{sub}$ is the reliability of each subsystem. In the following, we first estimate the
bounds for one subsystem and then use the above equation to get the bounds for DIRSMIN.

To estimate the lower reliability bound of one subsystem, we observe that the full access
property of a subsystem is maintained as long as (1) none of the N/4 SEs in the last stage is
faulty, and (2) at least one of the two complete subnetworks is fault-free. Thus, a conservative
lower bound on the all-terminal reliability of one subsystem is given by

$$R_{sub} > r^{N/4} \times \left(1 - (1 - r^{N'})^2\right). \qquad (12)$$

In the above equation, $N' = (N/8)(\log N - 2)$ is the total number of SEs in each of the four
subnetworks of DIRSMIN.

To obtain the upper bound for a subsystem, we first observe that each SE in a particular
stage from stage 1 to stage (n - 2) has one conjugate SE within the subsystem. Two SEs are
conjugate if they occupy corresponding positions in the two subnets to which they belong. The
subsystem fails if both SEs of a conjugate pair fail. For the bound, we treat the subsystem as
operational as long as no conjugate pair fails entirely and no SE in the last stage fails. Since
many combinations of failed SEs other than a failed conjugate pair also cause the subsystem to
fail, this overestimates the reliability. Therefore, the upper bound for the all-terminal
reliability of a subsystem is

$$R_{sub} < r^{N/4} \times \left(1 - (1 - r)^2\right)^{N'}. \qquad (13)$$
Finally, from Equations 11 and 12, we get the lower bound of the all-terminal reliability of
DIRSMIN as

$$R > 1 - \left(1 - r^{N/4} \times \left(1 - (1 - r^{N'})^2\right)\right)^2.$$

From Equations 11 and 13, we get the upper bound of the all-terminal reliability of DIRSMIN
as

$$R < 1 - \left(1 - r^{N/4} \times \left(1 - (1 - r)^2\right)^{N'}\right)^2.$$
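
The two bounds can be evaluated numerically as in the sketch below. It is a hedged illustration: it assumes 2 x 2 SEs, N a power of two, and an exponent of N' (one factor per conjugate pair) in the upper bound, which is inferred from the conjugate-pair argument above. The function name `dirsmin_bounds` is hypothetical.

```python
def dirsmin_bounds(n_ports, r):
    """Return (lower, upper) all-terminal reliability bounds for DIRSMIN."""
    log_n = n_ports.bit_length() - 1            # log2(N), N a power of two
    n_sub = (n_ports // 8) * (log_n - 2)        # N': SEs per subnetwork
    last_stage = r ** (n_ports // 4)            # all N/4 last-stage SEs work

    # Equation 12: at least one of the two subnetworks is completely fault-free.
    sub_lower = last_stage * (1 - (1 - r ** n_sub) ** 2)
    # Equation 13 (exponent N' assumed): no conjugate pair has both SEs failed.
    sub_upper = last_stage * (1 - (1 - r) ** 2) ** n_sub

    def parallel(r_sub):
        # Equation 11: two identical subsystems in parallel.
        return 1 - (1 - r_sub) ** 2

    return parallel(sub_lower), parallel(sub_upper)


lower, upper = dirsmin_bounds(64, 0.999)
print(lower, upper)
```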

6.2 Numerical Results and Comparison with SEN+ 

Using extra stages to create redundant paths between any S/D pairs has been proposed in the 
literature. The most robust and reliable of these networks, SEN+, has been analyzed in [19]. 
We, therefore, compare the all-terminal reliability of SEN+ with DIRSMIN. Analytical bounds 
on reliability for SEN+ have been estimated in [19]. In SEN+, two SE-disjoint subnetworks 
exist between the input and output stages. No SE faults at the first and last stage can be 
tolerated. The upper and lower all-terminal reliability bounds were obtained as 

$$R < r^{N} \times \left[1 - (1 - r)^2\right]^{N'}$$

and

$$R > r^{N} \times \left(1 - (1 - r^{N'})^2\right),$$

respectively, where $N' = (N/4)(\log N - 1)$ is the number of SEs in each of the two subnets.

Figures 12 and 13 compare the all-terminal reliability of DIRSMIN and SEN+ using the above
relations. Figure 12 shows the dependence of network reliability R on SE reliability for a
64 X 64 network. Figure 13 shows the network reliability for networks of different sizes
using a fixed SE reliability of r = 0.999. In all cases, the lower reliability bounds
of DIRSMIN are considerably higher than the upper reliability bounds of SEN+. Thus, the
reliability of DIRSMIN is much higher than that of SEN+. The main reason for this is that there
are four SE-disjoint subnetworks in DIRSMIN whereas there are only two in SEN+. In addition,
DIRSMIN can tolerate faults at any stage, including the last stage, which is not the case for SEN+.
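
The comparison can be reproduced numerically with a short sketch. It assumes 2 x 2 SEs, N a power of two, and an exponent of N' in the SEN+ upper bound; the function names are hypothetical, and the printed values are computed from the bound expressions rather than read from Figures 12 and 13.

```python
def senplus_bounds(n_ports, r):
    """Return (lower, upper) all-terminal reliability bounds for SEN+."""
    log_n = n_ports.bit_length() - 1            # log2(N), N a power of two
    n_sub = (n_ports // 4) * (log_n - 1)        # N': SEs per SEN+ subnet
    ends = r ** n_ports                         # all first/last-stage SEs work
    lower = ends * (1 - (1 - r ** n_sub) ** 2)
    upper = ends * (1 - (1 - r) ** 2) ** n_sub  # exponent N' assumed
    return lower, upper


def dirsmin_lower_bound(n_ports, r):
    """Lower bound for DIRSMIN: two parallel subsystems of two subnets each."""
    log_n = n_ports.bit_length() - 1
    n_sub = (n_ports // 8) * (log_n - 2)        # N': SEs per DIRSMIN subnet
    r_sub = r ** (n_ports // 4) * (1 - (1 - r ** n_sub) ** 2)
    return 1 - (1 - r_sub) ** 2


# Fixed SE reliability r = 0.999, varying network size.
for n in (16, 64, 256, 1024):
    print(n, dirsmin_lower_bound(n, 0.999), senplus_bounds(n, 0.999)[1])
```

For these sizes the DIRSMIN lower bound stays above the SEN+ upper bound, mainly because the SEN+ bounds carry the factor r^N for the unprotected first and last stages.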
7 Conclusions


We have developed a fault tolerant fast packet switch design, the (d, d')-DIRSMIN, which uses 
dilation to improve performance and fault tolerance of a network. This new network is capable 
of providing low packet loss probability and high reliability with very little hardware overhead 
compared to d-dilated banyan networks. Under non-faulty conditions, both simulation and 
analytical results show that a (d, d)-DIRSMIN performs better than the original dilated banyan 
network with the same SE complexity. Under faulty conditions, simulation results show that 
a (d, d')-DIRSMIN performs much better than a d-dilated Omega network. In these cases, a
(d, d')-DIRSMIN yields a monotonically decreasing loss probability as a function of the dilation
degree, whereas a d-dilated banyan network cannot connect certain S/D pairs, so its loss
probability is bounded below by a value that depends on how many faults are present in the
network.

A multiple priority scheme allows us to explore alternate paths simultaneously, which results
in higher throughput and reliability under both fault-free and faulty conditions. A
(d, d')-DIRSMIN tolerates multiple SE faults inside the network, including SEs at the input and
output stages. It is shown that the reliability of DIRSMIN is considerably higher than that of SEN+.

Figure 1: Various Multistage Interconnection Configurations. (a) Cube-Connected MIN;
(b) Shuffle-Exchange (Omega) Connected MIN; (c) Generic Extra-Link MIN (ELMIN);
(d) Omega-Connected ELMIN; (e) Redrawn Shuffle-Exchange MIN; (f) Removal of the first stage
and reconnection; (g) Four subnetworks in DIRSMIN; (h) Four priorities in redundancy graph.

Figure 2: Simulation results for 256X256 (6,d’)-DIRSMIN and Omega Networks: Throughput 
for different priority schemes (curves for 2 and 4 overlap). 



Figure 3: Simulation results for 256X256 (6,d’)-DIRSMIN and Omega Networks: Packet loss 
probability at full load. 

Figure 12: All-terminal reliability bounds for DIRSMIN and SEN+. N = 64 and r

Figure 13: All-terminal reliability bounds for DIRSMIN and SEN+. r = 0.999




